A Soft Decision Tree

Author

  • Hung Son Nguyen
Abstract

Searching for a binary partition of attribute domains is an important task in data mining, particularly in decision tree methods. The most important advantages of decision tree methods are the compactness and clarity of the knowledge they represent and their high classification accuracy. On large data tables, however, existing decision tree induction methods often prove inefficient in both computation and description. Another disadvantage of standard decision tree methods is their instability: a small deviation in the data can cause a total change of the decision tree. We present novel "soft discretization" methods using "soft cuts" instead of traditional "crisp" (sharp) cuts. This new concept allows us to generate more compact and stable decision trees with high classification accuracy. We also present an efficient method for soft cut generation from large databases.
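The abstract does not spell out what a soft cut looks like, so the following is a minimal illustrative sketch, assuming a soft cut replaces a crisp threshold c with an uncertainty interval (c − eps, c + eps) and routes values inside the interval to both branches with interpolated weights. The function name and the linear interpolation are assumptions for illustration, not the paper's definition.

```python
# Illustrative sketch (not the paper's definition): a "soft cut" on a
# numeric attribute is modeled as an interval (c - eps, c + eps).
# Values outside the interval are routed deterministically; values
# inside are routed to both branches with linearly interpolated weights.

def soft_cut_weights(value, c, eps):
    """Return (left_weight, right_weight) for one attribute value."""
    if value <= c - eps:
        return 1.0, 0.0
    if value >= c + eps:
        return 0.0, 1.0
    # Inside the uncertainty interval: interpolate linearly.
    right = (value - (c - eps)) / (2 * eps)
    return 1.0 - right, right

# Example: cut at c = 5.0 with margin eps = 1.0
print(soft_cut_weights(3.0, 5.0, 1.0))  # (1.0, 0.0): clearly left
print(soft_cut_weights(5.0, 5.0, 1.0))  # (0.5, 0.5): ambiguous region
```

A crisp cut is the special case eps = 0, which is why small perturbations of the data can flip a crisp split entirely while a soft split degrades gracefully.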


Similar articles

Lower Bounds on Quantum Query Complexity for Read-Once Decision Trees with Parity Nodes

We introduce a complexity measure for decision trees called the soft rank, which measures how well-balanced a given tree is. The soft rank is a somewhat relaxed variant of the rank. Among all decision trees of depth d, the complete binary decision tree (the most balanced tree) has maximum soft rank d, the decision list (the most unbalanced tree) has minimum soft rank √d, and any other trees have...
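The snippet only describes the soft rank informally, so as background this sketches the classical rank it relaxes (the Ehrenfeucht–Haussler rank): a leaf has rank 0, and an internal node has rank max(r1, r2) when its children's ranks differ and r1 + 1 when they are equal. The tree encoding is an assumption for illustration.

```python
# Classical rank of a binary decision tree (background for "soft rank"):
# leaf -> 0; internal node -> max of children's ranks if they differ,
# otherwise that common rank plus one.

def rank(tree):
    """tree is either None (a leaf) or a pair (left, right)."""
    if tree is None:
        return 0
    r1, r2 = rank(tree[0]), rank(tree[1])
    return max(r1, r2) if r1 != r2 else r1 + 1

# Complete binary tree of depth 2: rank 2 (most balanced).
complete = ((None, None), (None, None))
# Decision list of depth 2: rank 1 (most unbalanced).
dlist = (None, (None, None))
print(rank(complete), rank(dlist))  # 2 1
```

Under the classical rank the decision list scores the minimum 1; the soft rank stretches this low end to √d, which is what makes it a finer-grained balance measure.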


Bagging Soft Decision Trees

The decision tree is one of the earliest predictive models in machine learning. In the soft decision tree, based on the hierarchical mixture of experts model, internal binary nodes make soft decisions and choose both children with probabilities given by a sigmoid gating function. Hence, for an input, all the paths to all the leaves are traversed and all those leaves contribute to the final decis...
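The inference rule described above can be sketched directly: each internal node gates the input with a sigmoid, and every leaf contributes to the prediction weighted by its path probability. The node representation and the (unlearned) gate parameters below are illustrative assumptions, not the paper's implementation.

```python
import math

# Sketch of soft-decision-tree inference in the hierarchical-mixture-of-
# experts spirit: both children are taken, weighted by a sigmoid gate.
# Gate parameters (w, b) are illustrative, not learned here.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_tree_predict(x, node):
    """node is either ('leaf', value) or ('gate', w, b, left, right)."""
    if node[0] == 'leaf':
        return node[1]
    _, w, b, left, right = node
    p_right = sigmoid(w * x + b)          # gating probability
    return ((1.0 - p_right) * soft_tree_predict(x, left)
            + p_right * soft_tree_predict(x, right))

# Tiny depth-1 tree: leaf values 0 and 1, gate centered at x = 0.
tree = ('gate', 4.0, 0.0, ('leaf', 0.0), ('leaf', 1.0))
print(soft_tree_predict(0.0, tree))  # 0.5: both leaves contribute equally
```

Because every leaf contributes, the prediction varies smoothly with the input, which is also what makes such trees natural candidates for bagging.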


Soft context clustering for F0 modeling in HMM-based speech synthesis

This paper proposes the use of a new binary decision tree, which we call a soft decision tree, to improve generalization performance compared to the conventional ‘hard’ decision tree method that is used to cluster context-dependent model parameters in statistical parametric speech synthesis. We apply the method to improve the modeling of fundamental frequency, which is an important factor in sy...


On Exploring Soft Discretization of Continuous Attributes

Searching for a binary partition of attribute domains is an important task in data mining. It is present in both decision tree construction and discretization. The most important advantages of decision tree methods are compactness and clearness of knowledge representation as well as high accuracy of classification. Decision tree algorithms also have some drawbacks. In cases of large data tables...


Bias-variance tradeoff of soft decision trees

This paper focuses on the study of the error composition of a fuzzy decision tree induction method recently proposed by the authors, called soft decision trees. This error may be expressed as a sum of three types of error: residual error, bias and variance. The paper studies empirically the tradeoff between bias and variance in a soft decision tree method and compares it with the tradeoff of classi...
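The three-way error decomposition mentioned above (residual error, bias, variance, under squared loss) can be estimated by Monte Carlo. The sketch below uses a deliberately simple "model" (the mean of a small training sample predicting a noisy constant target) so the terms are easy to check; the setup is illustrative only, not the soft-tree experiment from the paper.

```python
import random

# Squared-error decomposition: expected error = residual (noise)
# + bias^2 + variance. Estimated over many retrainings on fresh data.

random.seed(0)
TRUE_VALUE, NOISE_SD, N_TRAIN, N_RUNS = 2.0, 0.5, 5, 20000

predictions = []
for _ in range(N_RUNS):
    sample = [random.gauss(TRUE_VALUE, NOISE_SD) for _ in range(N_TRAIN)]
    predictions.append(sum(sample) / N_TRAIN)   # retrain on fresh data

mean_pred = sum(predictions) / N_RUNS
bias_sq = (mean_pred - TRUE_VALUE) ** 2          # near 0: unbiased model
variance = sum((p - mean_pred) ** 2 for p in predictions) / N_RUNS
residual = NOISE_SD ** 2                         # irreducible noise

# Theory: variance of a mean of N_TRAIN samples is NOISE_SD^2 / N_TRAIN.
print(bias_sq, variance, residual)
```

For a soft tree the interesting question, which the paper studies empirically, is how softening the splits trades a small increase in bias for a reduction in variance.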




Journal:

Volume   Issue 

Pages  -

Publication date: 2002